A geometric framework for outlier detection in high‐dimensional data


Abstract

Outlier or anomaly detection is an important task in data analysis. We discuss the problem from a geometrical perspective and provide a framework which exploits the metric structure of a data set. Our approach rests on the manifold assumption, that is, that the observed, nominally high-dimensional data lie on a much lower dimensional manifold and that this intrinsic structure can be inferred with manifold learning methods. We show that exploiting this structure significantly improves the detection of outlying observations in high-dimensional data. We also suggest a novel, mathematically precise, and widely applicable distinction between distributional and structural outliers based on the geometry and topology of the data manifold that clarifies conceptual ambiguities prevalent throughout the literature. Our experiments focus on functional data as one class of structured high-dimensional data, but the framework we propose is completely general and we include image and graph data applications. Our results show that the outlier structure of high-dimensional and non-tabular data can be detected and visualized using manifold learning methods and quantified using standard outlier scoring methods applied to the manifold embedding vectors. This article is categorized under: Technologies > Structure Discovery and Clustering; Fundamental Concepts of Data and Knowledge > Data Concepts; Technologies > Visualization
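The pipeline described in the abstract (compute distances, learn a low-dimensional embedding, score outliers on the embedding vectors) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: classical MDS stands in for the manifold learning method, a mean k-nearest-neighbor distance stands in for the standard outlier score, and the toy functional data set (sine curves of varying amplitude plus one cosine curve) is invented.

```python
import math
import random

def pairwise_distances(curves):
    """Euclidean (discretized L2) distance between every pair of curves."""
    n = len(curves)
    return [[math.dist(curves[i], curves[j]) for j in range(n)] for i in range(n)]

def classical_mds(dist, dim=2, iters=200, seed=0):
    """Classical MDS: double-center the squared distances, then extract the
    leading eigenpairs by power iteration with deflation."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(r) / n for r in sq]
    total_mean = sum(row_mean) / n
    # B = -1/2 * J D^2 J, where J is the centering matrix
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0] * dim for _ in range(n)]
    for k in range(dim):
        v = [rng.gauss(0, 1) for _ in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:  # no variance left to explain
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the converged vector
        lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflate so the next pass finds the next eigenpair
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords

def knn_scores(points, k=3):
    """Outlier score: mean distance to the k nearest neighbors."""
    scores = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(d[:k]) / k)
    return scores

# Toy functional data (hypothetical): 30 sine curves whose amplitude varies
# smoothly, plus one cosine curve that lies off that one-dimensional family.
grid = [j / 50 for j in range(50)]
curves = [[(0.5 + i / 29) * math.sin(2 * math.pi * t) for t in grid]
          for i in range(30)]
curves.append([math.cos(2 * math.pi * t) for t in grid])

embedding = classical_mds(pairwise_distances(curves), dim=2)
scores = knn_scores(embedding, k=3)
most_outlying = max(range(len(scores)), key=scores.__getitem__)
print(most_outlying, round(scores[most_outlying], 3))
```

In this toy setup the cosine curve (the last one) receives by far the largest score, because it is the only observation that does not lie on the amplitude family. In practice one would likely swap in a library implementation of the embedding and a score such as the local outlier factor; the pure-Python version is only meant to keep the sketch self-contained.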


Similar resources

A Framework for Outlier Detection in Geographic Spatial Data

Outlier detection is a very interesting, useful, and challenging problem in the field of data mining. Because spatial data are sparse, distance-based clustering algorithms do not work well for finding outliers in them. The problem of finding irregular features in spatial data needs to be explored. Many existing approaches have been proposed to overcome the problem of outlier detection in spatial Geogr...


A statistical test for outlier identification in data envelopment analysis

In the use of peer group data to assess individual, typical or best practice performance, the effective detection of outliers is critical for achieving useful results. In these "deterministic" frontier models, statistical theory is now mostly available. This paper deals with the statistical pared sample method and its capability of detecting outliers in data envelopment analysis. In the prese...


A Unified Subspace Outlier Ensemble Framework for Outlier Detection

The task of outlier detection is to find small groups of data objects that are exceptional when compared with the rest of the large amount of data. Detection of such outliers is important for many applications, such as fraud detection and customer migration. Most such applications are high-dimensional domains in which the data may contain hundreds of dimensions. However, the outlier detection problem ...


Outlier detection for skewed data

Most outlier detection rules for multivariate data are based on the assumption of elliptical symmetry of the underlying distribution. We propose an outlier detection method which does not need the assumption of symmetry and does not rely on visual inspection. Our method is a generalization of the Stahel-Donoho outlyingness. The latter approach assigns to each observation a measure of outlyingne...


A framework for identifying and prioritizing factors affecting customers’ online shopping behavior in Iran

The purpose of this study is to identify the factors that lead customers to shop online in Iran and to investigate the importance of the discovered factors in online customers’ decisions. In the identifying phase, to discover the factors affecting the online shopping behavior of customers in Iran, the derived reference model summarizing antecedents of online shopping proposed by Chang et al. was us...



Journal

Journal title: Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery

Year: 2023

ISSN: 1942-4787, 1942-4795

DOI: https://doi.org/10.1002/widm.1491